
Conversation

@MatthewBonanni
Contributor

@MatthewBonanni MatthewBonanni commented Sep 23, 2025

Purpose

Remove the _VLLM_V1 suffixes from attention backend names, as they are no longer needed. If a user includes the suffix in their VLLM_ATTENTION_BACKEND environment variable, it is stripped and a warning is printed.
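
For illustration, a minimal sketch of the intended user-facing behavior (normalize_backend_name and its warning text are hypothetical stand-ins, not the actual selector code):

```python
# Minimal sketch of the suffix-stripping behavior; normalize_backend_name
# is a hypothetical helper for illustration, not vLLM's selector code.
import warnings


def normalize_backend_name(name: str) -> str:
    """Strip the deprecated _VLLM_V1 suffix, warning if it was present."""
    if name.endswith("_VLLM_V1"):
        warnings.warn(
            "The '_VLLM_V1' suffix in VLLM_ATTENTION_BACKEND is no longer "
            "necessary; V0 backends have been deprecated.")
        name = name.removesuffix("_VLLM_V1")
    return name


assert normalize_backend_name("FLASHINFER_VLLM_V1") == "FLASHINFER"
assert normalize_backend_name("FLASH_ATTN") == "FLASH_ATTN"
```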

Test Plan

Test Result


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively removes the _VLLM_V1 suffixes from attention backend names, which is a great cleanup now that V0 backends are deprecated. The changes are consistent across the codebase, covering test files, configuration, and backend implementations. I appreciate the addition of backward compatibility in vllm/attention/selector.py to handle the old suffixes from environment variables with a warning. I've found one minor area for improvement to simplify a redundant conditional check.

Member

@tlrmchlsmth tlrmchlsmth left a comment


This is a breaking change, right? We should put in a deprecation notice + fallback for at least a release.

Comment on lines +196 to +198
if backend_by_env_var.endswith("_VLLM_V1"):
    # Backward compatibility: warn and strip the deprecated V0-era suffix.
    logger.warning(
        "The suffix '_VLLM_V1' in the environment variable "
        "%s is no longer necessary as V0 backends have been "
        "deprecated. Please remove this suffix from your "
        "environment variable setting.", STR_BACKEND_ENV_VAR)
    backend_by_env_var = backend_by_env_var.removesuffix("_VLLM_V1")
Contributor Author


@tlrmchlsmth It shouldn't be breaking; these lines should prevent issues

Member


perfect

@mgoin mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 23, 2025
@mgoin
Member

mgoin commented Sep 24, 2025

Failures are related, PTAL

MatthewBonanni and others added 2 commits September 24, 2025 15:45
…ction logic)

Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Member

@mgoin mgoin left a comment


LGTM with the test changes, thanks a ton!

@mgoin mgoin enabled auto-merge (squash) September 24, 2025 22:47
@mgoin
Member

mgoin commented Sep 25, 2025

The v1 tests look related too

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
auto-merge was automatically disabled September 25, 2025 13:14

Head branch was pushed to by a user without write access

@MatthewBonanni
Contributor Author

@mgoin Thanks - I removed that test case too (there are no longer any V0 backends to fall back to). Eventually we'll have to remove _is_v1_supported_oracle and that full test_oracle.py file, but that'll be a different PR

@MatthewBonanni
Contributor Author

Created #25673 to remove the oracle

@MatthewBonanni
Contributor Author

@mgoin The V1 Test e2e + engine job is failing for backend FLASHINFER. This test should also be failing on main. It was previously running with backend FLASHINFER_VLLM_V1, which didn't exist, so it would fall back to FlashAttention, which passes (see the recent nightly). Now it correctly runs with FlashInfer, which fails. Not sure what to do about this one.

@ProExpertProg
Collaborator

@MatthewBonanni is it possible to fix the test? Otherwise you can make an issue and disable it

@MatthewBonanni
Contributor Author

MatthewBonanni commented Sep 25, 2025

@ProExpertProg It looks to me like a correctness issue with the FI backend, outside the scope of this PR, so I'll disable it and file an issue. Disabling it shouldn't do much harm, since it was effectively disabled before.

EDIT: The issue is #25679

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@ProExpertProg ProExpertProg enabled auto-merge (squash) September 25, 2025 17:21
@ProExpertProg ProExpertProg merged commit 3468f17 into vllm-project:main Sep 25, 2025
53 checks passed
@MatthewBonanni MatthewBonanni deleted the remove_suffix branch September 25, 2025 18:03
xuechendi added a commit to vllm-project/vllm-gaudi that referenced this pull request Sep 25, 2025
vllm-project/vllm#25489

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
iboiko-habana pushed a commit to iboiko-habana/vllm-gaudi that referenced this pull request Oct 2, 2025
vllm-project/vllm#25489

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>

Labels

ci/build, kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm), speculative-decoding, tpu (Related to Google TPUs), v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants